Journal: International Journal of Biomedical Imaging
Article Title: Facile Conversion and Optimization of Structured Illumination Image Reconstruction Code into the GPU Environment
doi: 10.1155/2024/8862387
Figure Legend Snippet: Impact of improved hardware and code on algorithm execution performance. We first measured the elapsed time of the vanilla code on a given image on every machine as a performance baseline. Then, we applied each approach independently and measured the elapsed time to show the performance improvement from each approach. In the CPU-Single-core-All case, we applied all approaches on a single CPU core. The CPU-Multicores-Process and CPU-Multicores-Threads cases show the elapsed time when each task is executed on a different core without applying the other approaches (red bar number 1). In the CPU-Multicores-All case, we applied all approaches, including multicore execution in which six tasks run on different cores; this case shows the best performance without exploiting the GPU (red bar number 2). The GPU-gpuArray case shows the elapsed time when we use the GPU only through the gpuArray() function, without applying the other approaches. This case clearly shows that the performance improvement is limited even with the GPU if the code is written inefficiently. The GPU-All (Script-dup), GPU-All (Script-non-dup), and GPU-All (Func-dup) cases show the benefits of avoiding duplicated operations and of using functions instead of scripts. Although these approaches improved performance only marginally in CPU-only code, they significantly affect overall execution time in GPU-optimized code when the execution time is less than a second. The GPU-All case shows the elapsed time with all the approaches introduced in this work applied, which is the best performance we can achieve (bottom red bar in each image panel).
Article Snippet: Although MATLAB's built-in functions, including gpuArray(), make it easy to exploit the GPU for a performance improvement, the gain can be limited by inefficiently written code.
Techniques: